Search results for "XML database"
showing 10 items of 10 documents
Building Semantic Trees from XML Documents
2016
The distributed nature of the Web, as a decentralized system exchanging information between heterogeneous sources, has underlined the need to manage interoperability, i.e., the ability to automatically interpret information in Web documents exchanged between different sources, which is necessary for efficient information management and search applications. In this context, XML was introduced as a data representation standard that simplifies interoperation and integration among heterogeneous data sources, allowing data to be represented in (semi-)structured documents consisting of hierarchically nested elements and atomic attributes. However, while XML was shown most …
A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics
2012
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficient…
Building Ontologies from XML Data Sources
2009
In this paper, we present a tool called X2OWL that aims at building an OWL ontology from an XML data source. The method uses the XML schema to automatically generate the ontology structure as well as a set of mapping bridges. It also includes a refinement step that makes it possible to clean the mapping bridges and, possibly, to restructure the generated ontology.
Requirements for XML document database systems
2001
The shift from SGML to XML has created new demands for managing structured documents. Many XML documents will be transient representations for the purpose of data exchange between different types of applications, but there will also be a need for effective means to manage persistent XML data as a database. In this paper we explore requirements for an XML database management system. The purpose of the paper is not to suggest a single type of system covering all necessary features. Instead the purpose is to initiate discussion of the requirements arising from document collections, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper …
An overview on XML similarity: Background, current trends and future directions
2009
In recent years, XML has been established as a major means for information management, and has been broadly utilized for complex data representation (e.g. multimedia objects). Owing to the unparalleled growth in the use of the XML standard, developing efficient techniques for comparing XML-based documents has become essential in the database and information retrieval communities. In this paper, we provide an overview of XML similarity/comparison by presenting existing research related to XML similarity. We also detail the possible applications of XML comparison processes in various fields, ranging over data warehousing, data integration, classification/clustering and XML querying, and discuss some…
Extensible User-Based XML Grammar Matching
2009
XML grammar matching has attracted considerable interest recently due to the growing number of heterogeneous XML documents on the web and the increasing need to integrate, and consequently search and retrieve, XML data originating from different data sources. In this paper, we provide an approach for automatic XML grammar matching and comparison aiming to minimize the amount of user effort required to perform the match task. We propose an open framework based on the concept of tree edit distance, integrating different matching criteria so as to capture XML grammar element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-ty…
XCDL: an XML-oriented visual composition definition language
2010
XML data flow has reached beyond the world of computer science and has spread to other areas such as data communication, e-commerce and instant messaging. Therefore, manipulation of this data by non-expert programmers is becoming imperative. On the one hand, Mashups emerged a few years ago, providing users with visual tools for web data manipulation that are not necessarily XML-specific. Mashups have been leaning towards functional composition but no formal languages have yet been defined. On the other hand, visual languages for XML have been emerging since the standardization of XML, mostly relying on querying XML data for extraction or structure transformations. These…
Adopting XML for Large-Scale Information
2011
This book has presented many different ways to encode information in XML format and the purposes for doing so. In this concluding chapter we consider problems related to managing XML information assets and the methods available to address those problems. Approaches for persistently storing XML data can be divided into file storage and database storage, and the research community has been especially active in designing new solutions for XML databases. However, adoption of XML often means massive migration procedures from some legacy data into the XML format; examples of migration cases are given. While describing the problems related to adopting XML, we give examples of the kinds of data fo…
A novel policy-driven reversible anonymisation scheme for XML-based services
2015
Author's version of an article in the journal: Information Systems. Also available from the publisher at: http://dx.doi.org/10.1016/j.is.2014.05.007 This paper proposes a reversible anonymisation scheme for XML messages that supports fine-grained enforcement of XACML-based privacy policies. Reversible anonymisation means that information in XML messages is anonymised, but the information required to reverse the anonymisation is cryptographically protected in the messages. The policy can control access down to octet ranges of individual elements or attributes in XML messages. The reversible anonymisation protocol effectively implements a multi-level privacy and security based approach, s…
ProcMiner: Advancing Process Analysis and Management
2007
This paper contributes both to research and practice on process mining. Previous research on process mining has focused on mining patterns from event log files to generate process models. The process mining approach adopted in this paper is focused on producing patterns about process models, not the models themselves. The approach is demonstrated by ProcMiner, an exploratory research prototype for managing, consolidating, publishing, retrieving, and analyzing process models. Content-based document clustering is applied to process models represented as an XML database in order to find topical groups among the models. In practice, organizations face numerous challenges in managing their process mod…